Relating the thermodynamic arrow of time to the causal arrow. 
Consider a Hamiltonian system that consists of a slow subsystem S and a fast subsystem F. 
The autonomous dynamics of S is driven by an effective Hamiltonian, but its thermodynamics is 
unexpected. We show that a well-defined thermodynamic arrow of time (second law) emerges for S 
whenever there is a well-defined causal arrow from S to F and the back-action is negligible. This is 
because the back-action of F on S is described by a non-globally Hamiltonian Born-Oppenheimer 
term that violates the Liouville theorem, and makes the second law inapplicable to S. If S and F are 
mixing, under the causal arrow condition they are described by microcanonic distributions P(S) and 
P(F|S). Their structure supports a causal inference principle proposed recently in machine learning. 
I. INTRODUCTION. 

In this paper we establish a relation between the causal 
arrow — i.e., emergence of a unidirectional interaction be- 
tween two interacting systems — and the thermodynamic 
arrow of time. Studying causation in the context of vari- 
ous physical arrows of time is not a new subject [H, d, Q. 
One of the motivations for these studies is the analogy be- 
tween the temporal asymmetry implied by the thermody- 
namic arrow and the asymmetry between the cause and 
effect: causes influence their effect, but not vice versa, 
and causes can only happen before their effects [J, 0, H[ . 

Causal explanations in everyday-life often construct 
causal structures among phenomena that are not well- 
localized in time (e.g., if one studies relations between 
crime and poverty in social sciences). Even for this kind 
of phenomena we observe sometimes well-defined causal 
connections where one phenomenon is the cause and an- 
other one the effect. For understanding the link be- 
tween thermodynamics and causality within a statistical 
physics setting, it is helpful to study the conditions under 
which we can consider one of two interacting systems 
as the cause and the other as the effect. The question is 
then to what extent the unidirectionality of the influence 
is related to the thermodynamics of the two systems. 

The presented results provide some answers to the 
above general question. For describing those answers we 
proceed with separate introductions on the thermody- 
namic arrow and the causal arrow. This section then 
closes with qualitative discussions of our main results. 



A. The thermodynamic arrow of time. 

The thermodynamic arrow of time refers to formulations 
of the second law. The understanding of this law from 
the first principles of quantum or classical dynamics is 
achieved within statistical physics (in contrast to 
thermodynamics, where the second law is postulated). In 
this statistical physics context we list the following basic 
formulations of the second law [3, H[ : 

• Entropy formulation: coarse-grained entropy does 
not decrease in time for a closed system that starts 
to evolve from a certain non-equilibrium state 0, a i a. 

• Thomson's formulation: for a system that starts to 
evolve from an equilibrium state, no work extraction 
is possible by means of a cyclic process driven 
by an external source of work 0, [1] . 

These statements entail an arrow of time, since they 
refer to the difference between final and initial values of 
the entropy and energy, 1 respectively. Each formulation 
has two different aspects: special initial conditions 
(non-equilibrium states for the entropy formulation, 
equilibrium states for Thomson's formulation) and specific 
dynamic features of the system (closed dynamics, cyclic 
processes). Both these aspects were studied from first 
principles [J |, 1, S i 0, & 1 ED, M El 2 . 

There are more formulations of the second law, such 
as the minimal work principle 0, [TT|, HI] or the Clausius 
formulation 0, 0, Il0| . Formulations of the second law 
are not always equivalent [T(| [H|. The Thomson and 
entropy formulations do not require anything more than 
a Hamiltonian dynamics that satisfies the Liouville 
equation [1, 0, Q , while the minimum work principle and the 
Clausius formulation do have additional requirements: 
an ergodic observable of work for the minimum work 
principle [lf| and a weak (or conserved) interaction Hamiltonian 
for the Clausius formulation [J, 0, [T(| . 



1 Since any interaction with an external source of work can be 
seen as a thermally isolated process, work is a difference between 
average energies; see section [VI] for details. 

2 The fact that we impose initial, and not final, conditions cannot 
be derived from the first principles. Instead, it should be taken as 
a fact that experiments are described by their initial conditions 
rather than being described by the final conditions. 
We shall thus focus on the Thomson and entropic 
formulations. Here the preference should be given to 
Thomson's formulation, since there is no universally accepted 
definition of physical entropy for non-equilibrium states. 
In contrast, there is such a definition for work [J, |7[. The 
formulation and derivation of the entropic and Thomson 
formulations will be recalled below in section VI. 



B. The causal arrow. 

The causal arrow refers to a dynamical situation in which one 
variable C (cause) influences another variable E (effect), but 
does not experience any back-reaction 3 . In this context we shall 
recall two operational definitions of the causal arrow: i) 
Cutting off the interaction between C and E does not 
alter the dynamics of C, while it influences the dynamics 
of E. ii) Perturbing the dynamics of C (e.g., by means 
of external fields, or by changing the initial conditions of 
C) will influence the dynamics of E, while perturbing 
the dynamics of E will not influence C. 

In studying causal relations (e.g., in economics, 
medicine, social sciences), scientific reasoning often 
depends on statistical data that have been obtained from 
mere observations. This is because interventions that 
would prove causal relations are often impossible. One 
then tries to draw plausible causal conclusions merely 
from stochastic dependences in the joint distribution 
function P(C,E) of the variables [T^. In spite of their 
obvious importance (sometimes our very survival 
depends on the proper identification of the cause versus 
the effect), such conclusions cannot always be correct; they 
are merely plausible in the sense that they lead to correct 
predictions more frequently than they fail 4 . 

Several criteria are known for this type of causal 
reasoning, if the number of variables involved in a network 
of possible causal relations is three or more [17]. The 
case of two variables is the most difficult one, since there 
are no widely accepted causal reasoning criteria for this 
situation. For this case it was recently proposed that one 
can plausibly identify C as the cause, if the probability 
distributions P(E|C) and P(C) are in a certain sense 
simpler than P(C|E) and P(E), respectively. Note that 
the ideas in [19] can be interpreted in the same spirit. 



C. Purposes and results of the present work. 

1. We shall follow in detail how the causal arrow and 
the thermodynamic arrow of time emerge in a closed, 



classical Hamiltonian system that consists of two subsystems 
S and F. For the sake of studying the causal arrow it is 
natural to assume that S is slow, while F is fast. 

In a more general perspective, the idea of slow versus 
fast variables has been developed in several different 
contexts, e.g., the slaving principle proposed by Haken 
as a cornerstone for synergetics, self-organization, and 
hierarchical dynamics [20]. Indeed, many (almost all?) 
models studied in mechanics, (non)equilibrium statistical 
physics, chemical kinetics, mathematical ecology, etc., 
are not fundamental, but rather describe effective 
behavior of slow degrees of freedom. 

2. The absence of the causal arrow in the above closed 
system is quantified by the back-reaction of F on S. Under 
some natural conditions outlined below, this back-reaction 
amounts to an additional (Born-Oppenheimer 5 ) 
term in the Hamiltonian of S. The dynamics of S is then 
autonomous and energy-conserving. However, the 
Born-Oppenheimer term has the following peculiar feature: it 
depends explicitly on the initial value of the coordinates 
of S that participate in the interaction with F. This is 
a consequence of memory generated during the tracing 
out of the fast variables. Thus there is no single, global 
Hamiltonian for S. We shall show that due to this fact 
the basic formulations of the second law do not apply to 
S, even if we assume the existence of proper initial 
conditions. The reasons for this inapplicability are discussed 
in detail in section VI. The main reason is that the 
Liouville theorem (conservation of the phase-space volume) 
does not apply to S. Thus, the usual formulation of the 
thermodynamic arrow of time does not apply to S 6 . 

3. If the Born-Oppenheimer term can be neglected for 
the dynamics of S, the applicability of the second law for 
S is recovered. That this term can be neglected indicates the 
existence of the causal arrow in the considered system: S appears 
to be the cause for F. Thus the local thermodynamic 
arrow for S emerges due to the causal arrow. 

Note that the second law applies to the fast subsystem 
F, which has a driven, globally Hamiltonian dynamics. 
Such a dynamics serves as a basis for deriving the second 
law from the first principles 0, 0, H, H, [IH HH • 

4. Another important consequence of the Born-Oppenheimer 
term is that it makes S strongly non-ergodic, 
even if the bare Hamiltonian of S is assumed 
to have ergodic features. [For the employed definition 
of ergodicity see the discussion around (10, 11); for the 
precise definition of what we mean by non-ergodicity 
see the discussion around (21).] Thus no microcanonical 
distribution can be introduced for S, unless the 
Born-Oppenheimer term is neglected. We show that together 



3 By the causal arrow we thus do not mean the macroscopic causal- 
ity that is when the past macro-state determines the future one. 

4 The fact that stochastic dependences cannot serve as the basis 
for drawing unique causal conclusions was stressed by Hume [id . 



5 The names come from the early days of atomic physics, when M. 
Born and R. Oppenheimer calculated in the quantum mechanical 
setting the force exerted by fast electrons on slow nuclei. 

6 This does not mean that there cannot be other — apart from the 
thermodynamic arrow in the sense explained in the introduc- 
tion — temporal asymmetries in the dynamics of S. 
with the emergence of the causal arrow, there appear 
natural microcanonical probability distributions 7 P(S) 
and P(F|S), where P(S) and P(F|S) are simpler (in the 
precise sense discussed below) than, respectively, P(F) 
and P(S|F). The above simplicity argument for the 
causal reasoning thus gets validated in the present 
approach. 

In section II we define the system to be studied. 
Sections III and IV discuss, respectively, the dynamics of the 
fast subsystem F and the convergence of its probability 
distribution toward the microcanonic distribution. 
Dynamics of the slow subsystem S is described in section V. 
In section VI we discuss in detail the (in)applicability of 
the basic statements of the second law (thermodynamic 
arrow) to the dynamics of S. The joint emergence of the 
thermodynamic arrow and the causal arrow is outlined 
in section VII. Section VIII relates the obtained results 
to the simplicity principle proposed recently in machine 
learning. The last section presents our conclusions and 
offers some speculations. 



II. FAST AND SLOW SUBSYSTEMS. 



The overall Hamiltonian of S + F reads 

$$ \mathcal{H}(\Pi, Q, z) = H_S(\Pi, Q) + H(z, Q), \tag{1} $$

where $z = (q_1, \dots, q_N; p_1, \dots, p_N)$ are canonical 
coordinates and momenta of F, and where $Q = (Q_1, \dots, Q_M)$ 
and $\Pi = (\Pi_1, \dots, \Pi_M)$ are, respectively, canonical 
coordinates and momenta of S. The bare Hamiltonian of S is 
$H_S(\Pi, Q)$, while $H(z, Q)$ combines the bare Hamiltonian 
of F and the interaction Hamiltonian between S and F. 

Let $\tau_f$ be the characteristic time of F for the slow 
variable Q being fixed [for a more precise definition see after 
(10)]. We shall assume that both $Q$ and $\dot Q$ are slow 
variables with respect to $\tau_f$. This assumption is consistent 
with the fact that the S-F coupling involves only the 
coordinate Q of S: according to the Hamilton equation 
$\dot Q = \partial_\Pi H_S(\Pi, Q)$ generated by (1), $\dot Q$ does not depend 
explicitly on the fast variable z. 

Define $\tau_Q$ and $\tau_{\dot Q}$ as the characteristic times over which 
$Q$ and $\dot Q$ change. Denote 

$$ \tau_{[Q]} = \min(\tau_Q, \tau_{\dot Q}). \tag{2} $$

Thus our basic assumption on the separated time-scales 
(adiabatic limit) reads 

$$ \tau_f \ll \tau_{[Q]}. \tag{3} $$



7 P(F|S) is the conditional probability for the coordinates and mo- 
menta of F, with the variables of S being fixed. 



III. ENERGY OF THE FAST SUBSYSTEM. 

Our intention is to see how the energy H(z, Q) of the 
fast subsystem F changes in time. 

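Hamilton's equations of motion for the fast subsystem 
imply $\frac{d}{dt} H(z_t, Q_t) = \dot Q_t\, \partial_Q H(z_t, Q_t)$. Assuming the 
adiabatic limit $\tau_f \ll \tau_{[Q]}$, and denoting by $Q_t$ and $z_t$ 
the time-dependent coordinates, we have for the energy 
change on the intermediate times $\tau_{[Q]} \gg \tau \gg \tau_f$: 

$$ \frac{dE}{d\tau} \equiv \frac{1}{\tau}\left[ H(z_{t+\tau}, Q_{t+\tau}) - H(z_t, Q_t) \right] \tag{4} $$

$$ = \frac{1}{\tau} \int_t^{t+\tau} ds\, \frac{dH}{ds}(z_s, Q_s) \tag{5} $$

$$ = \frac{\dot Q_t}{\tau} \int_t^{t+\tau} ds\, \partial_Q H(z_s, Q_t) + O\!\left(\frac{\tau}{\tau_{[Q]}}\right), \tag{6} $$

where we took $\dot Q_t$ out of the integral, since $\dot Q_t$ (together 
with $Q_t$) is assumed to be a slow variable. 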

The last integral in (6) refers to the Q = const dynamics 
with $Q_t = Q$. This dynamics has a constant energy 
$E = H(z, Q_t)$. Define the microcanonic distribution 

$$ M(z, E, Q) = \frac{1}{\omega(E, Q)}\, \delta[E - H(z, Q)], \tag{7} $$

$$ \omega(E, Q) = \int dz\, \delta[E - H(z, Q)], \tag{8} $$

where $\omega(E, Q)$ ensures the proper normalization: 
$\int dz\, M(z, E, Q) = 1$. 

Consider the following obvious relation: 

$$ \int dz\, w(z) M(z, E) = \frac{1}{\tau} \int_t^{t+\tau} ds \int dz\, w(z) M(z, E), \tag{9} $$

where $w(z) = \partial_Q H(z, Q_t)$, and where for simplicity we 
drop the explicit dependence on $Q = Q_t = {\rm const}$. 

In the RHS of (9) we change the integration variable as 
$y = \mathcal{T}_{t-s}\, z$, where $\mathcal{T}_t$ is the flow generated by the 
Hamiltonian $H(z) = H(z, Q_t)$ between times 0 and t. Employing 
Liouville's theorem, $dz = dy$, and energy conservation, 
$M(z, E) = M(y, E)$, one gets 

$$ \int dz\, w(z) M(z, E) = \int dy\, M(y, E)\, \frac{1}{\tau} \int_t^{t+\tau} ds\, w(\mathcal{T}_{s-t}\, y). \tag{10} $$



If w(z) is an ergodic observable of the $Q_t$ = const dynamics, 
then by definition of ergodicity there is a characteristic 
time $\tau_f$ such that for $\tau \gg \tau_f$ the time-average 
in (10) depends on the initial condition y only via its 
energy $H(y, Q_t)$ [2]], HI]. Since $M(y, E)$ is proportional to 
a $\delta$-function at $E = H(y, Q_t)$, the integration over y in 
(10) drops out, and we get that the time-average in (10) 
is equal to the microcanonical average at the energy E. 
Applying this to the time-average in (6) we get 

$$ \frac{dE}{d\tau} = \frac{dQ}{d\tau} \int dz\, \partial_Q H(z, Q_\tau)\, M(z, E_\tau, Q_\tau), \tag{11} $$
where we noted again that Q is a slow variable. 
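
The replacement of a time-average by a microcanonical average can be checked on a simple case. The sketch below is only an illustration under stated assumptions: it takes a one-dimensional harmonic fast subsystem, H(z,Q) = p^2/(2m) + Q^2 q^2/2 at fixed Q, for which w = dH/dQ = Q q^2 and the microcanonical average of w is E/Q. The function names are hypothetical and not part of the original text.

```python
import math

def time_average_force(E, Q, m, n_periods=50, steps_per_period=2000):
    """Time-average of w = dH/dQ = Q*q**2 along the exact Q = const trajectory.

    Assumed fast Hamiltonian (illustrative): H = p^2/(2m) + Q^2 q^2 / 2.
    """
    omega = Q / math.sqrt(m)          # oscillation frequency at fixed Q
    amp = math.sqrt(2.0 * E) / Q      # amplitude fixed by the energy E
    period = 2.0 * math.pi / omega
    n = n_periods * steps_per_period
    dt = n_periods * period / n
    total = 0.0
    for k in range(n):
        q = amp * math.cos(omega * (k + 0.5) * dt)   # midpoint sampling
        total += Q * q * q
    return total / n

def microcanonical_force(E, Q):
    """Microcanonical average of Q*q**2 on the energy shell E (uniform in the phase angle)."""
    return E / Q

if __name__ == "__main__":
    print(time_average_force(1.3, 2.0, 0.01), microcanonical_force(1.3, 2.0))
```

For this assumed Hamiltonian the two averages agree, and inserting the result into the averaged energy balance gives dE/dτ = (dQ/dτ) E/Q, i.e. E/Q stays constant along the slow motion.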

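We define the phase-space volume $\Omega$ enclosed by the 
energy shell E: 

$$ \Omega(E, Q) = \int dz\, \theta(E - H(z, Q)), \tag{12} $$

where $\theta$ is the step function. Let us see how $\Omega(E, Q)$ 
changes in the slow time: 

$$ \frac{d\Omega}{d\tau}(E, Q) = \partial_E \Omega|_Q\, \frac{dE}{d\tau} + \partial_Q \Omega|_E\, \frac{dQ}{d\tau}. \tag{13} $$

Using (7, 12) we get 

$$ \int dz\, \partial_Q H(z, Q_\tau)\, M(z, E_\tau, Q_\tau) = -\frac{\partial_Q \Omega|_E}{\partial_E \Omega|_Q}, \tag{14} $$

and then from (11, 13, 14) 

$$ \frac{d\Omega}{d\tau}(E, Q) = \partial_E \Omega|_Q \left[ \frac{dE}{d\tau} + \frac{dQ}{d\tau}\, \frac{\partial_Q \Omega|_E}{\partial_E \Omega|_Q} \right] = \partial_E \Omega|_Q \left[ \frac{dE}{d\tau} - \frac{dE}{d\tau} \right] = 0. \tag{15} $$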

Thus, the phase-space volume $\Omega(E, Q)$ is an adiabatic 
invariant, i.e., it is conserved within the slow dynamics. 
In particular, in the adiabatic limit the points of the fast 
phase-space located initially at the energy shell $E_i$ appear 
on the energy shell $E_f$, which is found from 

$$ \Omega(E_i, Q_i) = \Omega(E_f, Q_f). \tag{16} $$

Since by definition (12), $\Omega(E)$ is an increasing function of 
E, for given $Q_i$, $Q_f$ and $E_i$ the equation (16) has a unique 
solution 

$$ E_f = h(Q_f | E_i, Q_i). \tag{17} $$

In the adiabatic limit the energy $h(Q_f | E_i, Q_i)$ of F does 
not depend on the precise phase-space location of the fast 
trajectory on the energy shell $E_i$. 
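
The adiabatic invariance of the phase-space volume can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes a one-dimensional harmonic fast subsystem, H = p^2/(2m) + Q(t)^2 q^2/2, for which the enclosed phase-space volume is proportional to E/Q, and it drives Q by a slow linear ramp. The integrator (leapfrog), the ramp, and all parameter values are illustrative assumptions.

```python
def evolve_fast_oscillator(m=0.01, Q_i=1.0, Q_f=2.0, T_slow=200.0, dt=1e-3):
    """Leapfrog integration of H = p^2/(2m) + Q(t)^2 q^2 / 2 with a slow linear ramp of Q.

    Returns the initial and final energies of the fast subsystem.
    All parameter values are illustrative assumptions.
    """
    def Q_of(t):
        return Q_i + (Q_f - Q_i) * t / T_slow

    def energy(q, p, Q):
        return p * p / (2.0 * m) + 0.5 * Q * Q * q * q

    q, p = 1.0, 0.0
    E_i = energy(q, p, Q_i)
    n_steps = int(round(T_slow / dt))
    t = 0.0
    for _ in range(n_steps):
        p -= 0.5 * dt * Q_of(t) ** 2 * q   # half kick with the current Q
        q += dt * p / m                    # drift
        t += dt
        p -= 0.5 * dt * Q_of(t) ** 2 * q   # half kick with the updated Q
    E_f = energy(q, p, Q_f)
    return E_i, E_f

if __name__ == "__main__":
    E_i, E_f = evolve_fast_oscillator()
    # For this assumed model Omega is proportional to E/Q, so E/Q should be nearly conserved.
    print(E_i / 1.0, E_f / 2.0)
```

With the default values the fast period is much shorter than the ramp time, so the ratio E/Q at the end of the ramp stays close to its initial value, which is the content of the relation between the initial and final energy shells for this model.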

Note that the derivation of (11) does not demand full 
ergodicity (which means that all smooth observables 
of F are ergodic); only a certain observable is assumed to 
be ergodic [21]. The argument expressed by (9, 10) 
applies to calculating the time-average of any ergodic 
observable w(z) of F for a fixed Q. 

The adiabatic invariance of $\Omega$ for ergodic systems is 
well known [22|, HH, H3 and motivated the microcanonic 
definition of entropy as $\ln \Omega$ [23, 24]. The precision of 
the invariance is studied in [25]. We presented the above 
derivation for the completeness of this work and for 
highlighting two basic assumptions that are not properly 
articulated in the literature: i) ergodicity of an observable 
versus full ergodicity; ii) the necessity for both 
$Q$ and $\dot Q$ being slow. 



IV. CONDITIONAL MICROCANONIC 
DISTRIBUTION OF THE FAST SUBSYSTEM. 

For describing time-averages of ergodic observables of 
F [see (9, 10) and the discussion after (11)] we can 
employ the following time-dependent microcanonic 
conditional probability: 

$$ P_f[z | Q_\tau, \Pi_\tau] = \frac{\delta[h(Q_\tau | E_i, Q_i) - H(z, Q_\tau)]}{\int dz\, \delta[h(Q_\tau | E_i, Q_i) - H(z, Q_\tau)]}. \tag{18} $$

Below we explain how to find $Q_\tau$ given the initial energy 
$E_i$ of F, the initial canonical coordinates $Q_i$, $\Pi_i$ of S and 
the time $\tau$. Note that $P_f[z | Q_\tau, \Pi_\tau]$ is time-dependent and 
varies with time on the slow time-scale $\tau \sim \tau_{[Q]}$. 

There is another way of introducing the microcanonic 
distribution (18), which explicitly uses the ensemble 
description [26, 27]. If for a fixed Q the system F is mixing, 
then for any sufficiently smooth initial probability 
distribution $\rho(z, 0)$ of F, the ensemble averages of sufficiently 
smooth (i.e., sufficiently coarse-grained) observables A(z) 
of F converge in time to the averages taken over (18) 
MM: 

$$ \int dz\, \rho(z, t) A(z) \to \int dz\, P_f[z | Q, \Pi]\, A(z). \tag{19} $$

The rate of this convergence defines the mixing time. It 
is more natural (especially for chaotic systems) to 
define observables via ensemble averages than via averages 
over time [26]. If not stated otherwise, from now on we 
assume that F is mixing, and thus the mixing time 
coincides with $\tau_f$ ($\ll \tau_{[Q]}$) defined around (10). For strongly 
(and homogeneously) chaotic systems the mixing time is 
inversely proportional to the KS entropy [H, 1271 ]. 



V. DYNAMICS OF THE SLOW SUBSYSTEM. 

Let us average the equations of motion 
$\dot\Pi = -\partial_Q [H_S(\Pi, Q) + H(z, Q)]$ and $\dot Q = \partial_\Pi H_S(\Pi, Q)$ over 
the microcanonic distribution (18). We get that S is by 
itself a Hamiltonian system: 

$$ \frac{d\Pi}{d\tau} = -\partial_Q \mathcal{H}_S, \qquad \frac{dQ}{d\tau} = \partial_\Pi \mathcal{H}_S, \tag{20} $$

with an effective Hamiltonian 

$$ \mathcal{H}_S(\Pi, Q | Q_i, E_i) = H_S(\Pi, Q) + h(Q | Q_i, E_i), \tag{21} $$

which is the sum of $H_S(\Pi, Q)$ and the Born-Oppenheimer 
term $h(Q | Q_i, E_i)$. In particular, $\mathcal{H}_S(\Pi, Q | Q_i, E_i)$ 
determines the actual slow trajectory $Q_\tau$, given its initial 
location $(\Pi_i, Q_i)$. Substituting this back into (18) we thus 
complete the description of F. 
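
The step from the averaged equations of motion to the effective Hamiltonian can be made explicit. The short derivation below is a sketch that uses only the adiabatic invariance of the phase-space volume and the relation between the microcanonical average of $\partial_Q H$ and the derivatives of $\Omega$ stated in section III; it is our reading of the argument, not a quotation from the text.

```latex
% Adiabatic invariance defines h implicitly:
%   \Omega\big(h(Q|Q_i,E_i),\,Q\big) = \Omega(E_i,Q_i) = \mathrm{const}.
% Differentiating with respect to Q at fixed (Q_i, E_i):
\partial_E\Omega|_Q \,\partial_Q h + \partial_Q\Omega|_E = 0
\quad\Longrightarrow\quad
\partial_Q h(Q|Q_i,E_i) = -\frac{\partial_Q\Omega|_E}{\partial_E\Omega|_Q}
 = \int dz\,\partial_Q H(z,Q)\,M(z,h,Q).
% Hence the microcanonically averaged force on S is
-\partial_Q H_S(\Pi,Q) - \int dz\,\partial_Q H(z,Q)\,M(z,h,Q)
 = -\partial_Q\big[H_S(\Pi,Q) + h(Q|Q_i,E_i)\big].
```

In words: the averaged back-action of F on S is the gradient of the adiabatically transported fast energy, which is why it enters the slow dynamics as an additive potential.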

The evolution generated by (20) conserves the energy 
$\mathcal{H}_S$. This is the total energy of S + F. Note that the 
Born-Oppenheimer term $h(Q | Q_i, E_i)$ depends on the 
initial coordinate $Q_i$. This means that the points in the 
phase-space $(\Pi, Q)$ that had initially equal energy (but 
different initial coordinates $Q_i$) will have different 
energies at later times. Thus S is not globally Hamiltonian. 

While this fact seems to be of no special importance 
when we consider a single slow trajectory, it matters 
much for developing statistical physics for S. Indeed, 
there is no global slicing of the phase space into energy 
shells, which makes the definition of the microcanonic 
distributions impossible. 

Thus S is non-ergodic: while ergodic systems are 
characterized by losing the memory of the initial 
phase-space location and remembering only the initial energy 
[recall the argument around (9, 10)], in the considered 
situation the very form of the energy depends on the 
initial phase-space location. 



A. Liouville equation and Liouville theorem. 

A consequence of the non-globally Hamiltonian 
dynamics is that the Liouville equation and the corresponding 
theorem do not hold. With the Hamilton equations 
(20) one can relate a conditional probability 

$$ \mathcal{P}_{\rm con}(\Pi, Q, \tau | \Pi_i, Q_i, 0) = \delta(\Pi - \Pi(\Pi_i, Q_i, \tau))\, \delta(Q - Q(\Pi_i, Q_i, \tau)), \tag{22} $$

where $\Pi(\Pi_i, Q_i, \tau)$ and $Q(\Pi_i, Q_i, \tau)$ are the solutions of 
(20) with initial conditions $(\Pi_i, Q_i)$. 

As follows from (20, 22), $\mathcal{P}_{\rm con}(\Pi, Q, \tau | \Pi_i, Q_i, 0)$ does 
satisfy the Liouville equation 

$$ \partial_\tau \mathcal{P}_{\rm con} = \partial_Q \mathcal{H}_S\, \partial_\Pi \mathcal{P}_{\rm con} - \partial_\Pi \mathcal{H}_S\, \partial_Q \mathcal{P}_{\rm con}. \tag{23} $$

Were $\mathcal{H}_S$ not dependent on $Q_i$, the direct integration of 
(23) with the initial distribution $\mathcal{P}(\Pi_i, Q_i, 0)$ would 
produce the Liouville equation for the unconditional 
probability $\mathcal{P}(\Pi, Q, \tau)$. But since $\mathcal{H}_S(\Pi, Q | Q_i)$ does depend on 
$Q_i$, the integration with $\mathcal{P}(\Pi_i, Q_i, 0)$ does not lead to a 
differential equation for $\mathcal{P}(\Pi, Q, \tau)$. 

Thus the Liouville equation and together with it the 
Liouville theorem (conservation of the phase-space 
volume) do not hold. Below we shall demonstrate this on 
an explicit example. 

B. An example. 

We assume that F and S without mutual coupling are 
two free particles, with masses m and M, respectively. 
The S-F coupling creates a harmonic potential for F: 

$$ H(z, Q) = \frac{p^2}{2m} + \frac{Q^2 q^2}{2}. \tag{24} $$

If we regard the slow variable Q as a parameter, F is an 
ergodic system with the characteristic time 

$$ \tau_f = \frac{2\pi\sqrt{m}}{Q}. \tag{25} $$

Eq. (15) reduces to the conservation of action: 
$E/|Q| = {\rm const}$, and thus the Born-Oppenheimer 
potential $h(Q | E_i, Q_i)$ reads from (17) 

$$ h(Q | E_i, Q_i) = E_i\, \frac{|Q|}{|Q_i|}. \tag{26} $$

As the simplest example of the bare slow Hamiltonian 
we can take free motion with a mass M: 

$$ H_S = \frac{\Pi^2}{2M}. \tag{27} $$

Thus the dynamics of the slow subsystem S is described 
by the effective Hamiltonian $\mathcal{H}_S = \frac{\Pi^2}{2M} + E_i \frac{|Q|}{|Q_i|}$. Assume 
that $Q > 0$ and solve the Hamilton equations as: 

$$ \Pi(\tau) = \Pi_i - \frac{E_i \tau}{Q_i}, \qquad Q(\tau) = Q_i + \frac{\Pi_i \tau}{M} - \frac{E_i \tau^2}{2 M Q_i}, \tag{28} $$

where the initial time was taken $\tau = 0$. The characteristic 
time $\tau_Q$ of Q can be estimated from $Q(\tau_Q) - Q_i \sim Q_i$: 

$$ \tau_Q = \min\left[ \frac{M Q_i}{\Pi_i},\ \sqrt{\frac{2 M Q_i^2}{E_i}} \right]. \tag{29} $$

For the characteristic time $\tau_{\dot Q}$ of $\dot Q$ [estimated via 
$\dot Q(\tau_{\dot Q}) - \dot Q_i \sim \dot Q_i$] we get 

$$ \tau_{\dot Q} = \frac{Q_i \Pi_i}{E_i}. \tag{30} $$

If $\Pi_i \to 0$ we should take $\tau_{\dot Q} = \sqrt{2 M Q_i^2 / E_i}$. 

It is seen now that unless $Q(\tau) \approx 0$, the adiabatic 
conditions $\tau_Q \gg \tau_f$ and $\tau_{\dot Q} \gg \tau_f$ can be satisfied, e.g., 
for a sufficiently small m and sufficiently large M. 

One now has from (28) for the Jacobian: 

$$ J(\tau) = \frac{\partial(\Pi(\tau), Q(\tau))}{\partial(\Pi_i, Q_i)} = 1 - \frac{E_i \tau^2}{2 M Q_i^2}, \tag{31} $$

which is not equal to 1. Moreover, its absolute value can 
be both larger or smaller than one, since it is not difficult 
to see that the conditions $Q > 0$ and $\frac{E_i \tau^2}{2 M Q_i^2} > 2$ can be 
satisfied together. 
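
The Jacobian of the slow flow can be cross-checked numerically. The sketch below assumes the closed-form slow trajectory of this example, Π(τ) = Π_i − E_i τ/Q_i and Q(τ) = Q_i + Π_i τ/M − E_i τ²/(2 M Q_i) (valid while Q stays positive), and compares a finite-difference estimate of the Jacobian with 1 − E_i τ²/(2 M Q_i²). The function names and parameter values are hypothetical.

```python
def slow_trajectory(Pi_i, Q_i, tau, E_i, M):
    """Assumed closed-form slow flow for H_eff = Pi^2/(2M) + E_i*Q/Q_i (Q > 0)."""
    Pi = Pi_i - E_i * tau / Q_i
    Q = Q_i + Pi_i * tau / M - E_i * tau ** 2 / (2.0 * M * Q_i)
    return Pi, Q

def jacobian_numeric(Pi_i, Q_i, tau, E_i, M, eps=1e-6):
    """Central finite-difference estimate of d(Pi(tau), Q(tau)) / d(Pi_i, Q_i)."""
    Pp, Qp = slow_trajectory(Pi_i + eps, Q_i, tau, E_i, M)
    Pm, Qm = slow_trajectory(Pi_i - eps, Q_i, tau, E_i, M)
    dPi_dPi, dQ_dPi = (Pp - Pm) / (2 * eps), (Qp - Qm) / (2 * eps)
    Pp, Qp = slow_trajectory(Pi_i, Q_i + eps, tau, E_i, M)
    Pm, Qm = slow_trajectory(Pi_i, Q_i - eps, tau, E_i, M)
    dPi_dQ, dQ_dQ = (Pp - Pm) / (2 * eps), (Qp - Qm) / (2 * eps)
    return dPi_dPi * dQ_dQ - dPi_dQ * dQ_dPi

def jacobian_closed_form(Q_i, tau, E_i, M):
    """Closed-form Jacobian 1 - E_i*tau^2 / (2*M*Q_i^2)."""
    return 1.0 - E_i * tau ** 2 / (2.0 * M * Q_i ** 2)

if __name__ == "__main__":
    print(jacobian_numeric(3.0, 1.0, 0.5, 2.0, 1.0),
          jacobian_closed_form(1.0, 0.5, 2.0, 1.0))
```

The key point the check makes visible is that the fast energy E_i is held fixed while Q_i is varied: the dependence of the effective Hamiltonian on Q_i is what moves the determinant away from one.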

Perhaps the most visible consequence of the absence of 
the Liouville theorem is that the fine-grained entropy 

$$ S_{fg}[\tau] = -\int d\Pi\, dQ\, \mathcal{P}(\Pi, Q, \tau) \ln \mathcal{P}(\Pi, Q, \tau) \tag{32} $$

of the slow subsystem is not constant anymore. Indeed, 
take a small phase-space volume $v(0)$ and assume that 
$\mathcal{P}(\Pi, Q, 0)$ is constant inside of this volume and equal 
to zero outside. The fine-grained entropy (32) is then 
$S_{fg}[\tau] = \ln v(\tau)$, where $v(\tau)$ is obtained from $v(0)$ under the 
action of the flow generated by the Hamiltonian $\mathcal{H}_S$. Thus, 
$S_{fg}[\tau] - S_{fg}[0] = \ln \frac{v(\tau)}{v(0)} = \ln |J(\tau)|$ can both increase 
and decrease in the course of time, as (31) illustrates. 
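
The change of the fine-grained entropy of a small uniformly filled cell can be estimated directly from the area of the evolved cell. The sketch below is an illustration under assumptions: it uses the closed-form slow flow of the example, Π(τ) = Π_i − E_i τ/Q_i and Q(τ) = Q_i + Π_i τ/M − E_i τ²/(2 M Q_i), maps the four corners of a small square, and approximates the evolved volume by the polygon area (shoelace formula). The names and numbers are hypothetical.

```python
import math

def flow(Pi_i, Q_i, tau, E_i, M):
    """Assumed closed-form slow flow of the example (valid while Q > 0)."""
    return (Pi_i - E_i * tau / Q_i,
            Q_i + Pi_i * tau / M - E_i * tau ** 2 / (2.0 * M * Q_i))

def polygon_area(points):
    """Unsigned area of a polygon from its ordered vertices (shoelace formula)."""
    area = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def entropy_change(Pi_c, Q_c, side, tau, E_i, M):
    """ln[v(tau)/v(0)] for a small square cell centred at (Pi_c, Q_c)."""
    h = side / 2.0
    corners = [(Pi_c - h, Q_c - h), (Pi_c + h, Q_c - h),
               (Pi_c + h, Q_c + h), (Pi_c - h, Q_c + h)]
    evolved = [flow(P, Q, tau, E_i, M) for P, Q in corners]
    return math.log(polygon_area(evolved) / polygon_area(corners))

if __name__ == "__main__":
    print(entropy_change(3.0, 1.0, 1e-3, 0.5, 2.0, 1.0))  # cell shrinks
    print(entropy_change(3.0, 1.0, 1e-3, 2.0, 2.0, 1.0))  # cell grows
```

Depending on τ the cell shrinks or expands, so the entropy of the uniform distribution on it decreases or increases, in line with the sign-indefinite logarithm of the Jacobian. Whether the adiabatic conditions hold for the chosen numbers is not checked here.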
When can one neglect the non-conservation of the 
phase-space volume? Taking in (31) $\tau \sim \tau_{[Q]}$, and going 
in (29) to the limit of a small $E_i$ or a large M, we get that 
the non-conservation of the phase-space volume can be 
neglected (though Q still changes significantly) if the 
fast energy $E_i$ is much smaller than the bare slow energy 
$\frac{\Pi_i^2}{2M}$. 



VI. THERMODYNAMIC ARROW FOR THE 
SLOW SUBSYSTEM. 

A. Thomson formulation of the second law. 

How does the second law apply to the effectively 
Hamiltonian, autonomous slow subsystem S? The basic 
formulation of the second law is due to Thomson: no work can 
be extracted from a system that is initially in equilibrium via a cyclic 
change of an external field. This statement is derived as a 
theorem both in classical and quantum mechanics 0, Q . 
We already argued why this formulation is superior to the 
entropy formulation: entropy is not directly observable 
and there is no general consensus on its definition for a 
non-equilibrium state. In contrast, work is directly 
observable, has a clear mechanical meaning, and its general 
definition is universally accepted [J, 0] . Here we focus on 
Thomson's formulation, while the entropic formulation is 
studied below. 

Let us recall the statement of the Thomson formulation 
when no interaction between S and F is present, i.e., the 
dynamics of S is generated by 

$$ H_S(\Gamma, \lambda_\tau), \qquad \Gamma \equiv (Q, \Pi). \tag{33} $$

The interaction of S with external sources of work is 
described by a time-dependent field $\lambda_\tau$ 0, 0] . 

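Let the initial phase-space points be sampled according 
to the Gibbs distribution: 

$$ \mathcal{P}_G(\Gamma) = \frac{e^{-\beta H_S(\Gamma, \lambda)}}{Z}, \qquad Z = \int d\Gamma\, e^{-\beta H_S(\Gamma, \lambda)}, \tag{34} $$

where $\beta = 1/T > 0$ is the inverse temperature. A cyclic 
change of the external field means: 

$$ \lambda_0 = \lambda_{\tau_c} = \lambda, \tag{35} $$

where $\tau_c$ is the cycle time. 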

For the considered thermally isolated process the work 
is defined as the average energy difference 8 , and the 
statement of the Thomson formulation reads @, @] : 

$$ W = \int d\Gamma\, H_S(\Gamma, \lambda) \left[ \mathcal{P}(\Gamma, \tau_c) - \mathcal{P}_G(\Gamma) \right] \geq 0, \tag{36} $$

where $\mathcal{P}(\Gamma, \tau_c)$ is the final (at $\tau = \tau_c$) probability 
distribution obtained from the initial Gibbsian probability 
distribution $\mathcal{P}_G(\Gamma)$ via the Liouville equation with the 
time-dependent Hamiltonian (33). 

8 Work for a single trajectory $(\Pi_\tau, Q_\tau)$ is defined as 
$W = \int_0^\tau du\, \frac{d\lambda_u}{du}\, \partial_\lambda H_S(\Pi_u, Q_u, \lambda_u)$. Employing the Hamilton 
equations of motion we get $W = H_S(\Pi_\tau, Q_\tau, \lambda_\tau) - H_S(\Pi_i, Q_i, \lambda_i)$, 
where $(\Pi_\tau, Q_\tau, \lambda_\tau)$ and $(\Pi_i, Q_i, \lambda_i)$ are the corresponding final 
and initial values. Averaging this expression over the initial and 
final values, and recalling (35), we get the expression of work as 
the average energy difference (36). 

The inequality in (36) is essentially based on three facts: 
i) initial and final Hamiltonians are the same due to (33, 
35); ii) the same Hamiltonian appears in the initial Gibbs 
distribution; iii) the Liouville equation. 

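The easiest way to establish the validity of (36) is to 
employ the positivity of the relative entropy 0] : 

$$ S[\mathcal{P}(\tau_c) \| \mathcal{P}_G] = \int d\Gamma\, \mathcal{P}(\Gamma, \tau_c) \ln \frac{\mathcal{P}(\Gamma, \tau_c)}{\mathcal{P}_G(\Gamma)} \geq 0, \tag{37} $$

which holds for any probability distributions $\mathcal{P}(\Gamma, \tau_c)$ and 
$\mathcal{P}_G(\Gamma)$. Employing in (37) the conservation of the 
fine-grained entropy, $S_{fg}[\mathcal{P}(\tau_c)] = S_{fg}[\mathcal{P}_G]$, due to the 
Liouville theorem, we get 

$$ \int d\Gamma \left[ \mathcal{P}_G(\Gamma) - \mathcal{P}(\Gamma, \tau_c) \right] \ln \mathcal{P}_G(\Gamma) \geq 0, \tag{38} $$

and then substituting (34) into $\ln \mathcal{P}_G(\Gamma)$ in (38) and 
recalling (35) we arrive at (36). 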
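
A discrete analogue may help to see why volume conservation plus a Gibbsian start forbids work extraction. The sketch below is not the continuous proof above: it assumes a finite set of energy levels, models a cyclic, volume-preserving evolution as a permutation of the states (the discrete counterpart of the Liouville theorem), and computes the average energy change when the initial probabilities are Gibbsian. All names and values are illustrative.

```python
import math
import random

def gibbs(energies, beta):
    """Gibbs probabilities p_k proportional to exp(-beta * e_k)."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def cyclic_work(energies, beta, permutation):
    """Average energy change when state k is moved to state permutation[k].

    The same energies are used before and after (cyclic process); the
    permutation stands in for a phase-space-volume-preserving map.
    """
    p0 = gibbs(energies, beta)
    final = [0.0] * len(energies)
    for k, target in enumerate(permutation):
        final[target] += p0[k]
    return sum(e * (pf - pi) for e, pf, pi in zip(energies, final, p0))

if __name__ == "__main__":
    random.seed(0)
    levels = [0.0, 0.5, 1.3, 2.1, 4.0]
    perm = list(range(len(levels)))
    random.shuffle(perm)
    print(cyclic_work(levels, 1.0, perm))
```

In this toy setting the work is never negative because the Gibbs probabilities decrease with energy, so any reshuffling of states can only move probability toward higher energies on average. A map that is not a permutation (the analogue of a non-unit Jacobian) is not covered by this argument.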

Let us now return to the slow subsystem S coupled to 
F. Now the slow Hamiltonian is given by (21) instead 
of (33). At the initial time these two Hamiltonians are 
equal up to the additive constant $E_i$. We shall assume that the 
initial probability for $\Pi$ and Q is still given by (34), while 
initially the fast system always starts with the same 
energy $E_i$. For instance it is described by the microcanonic 
probability distribution (18), and then the overall initial 
distribution of S and F is the product of the above 
specified marginal distributions for S and F. 

Thus the overall distribution is not Gibbsian and the 
applicability of the Thomson formulation to the 
overall system is not automatic. The work is still given by 
the average energy difference (of the slow subsystem, or, 
equivalently, of the total system) calculated via the 
effective slow Hamiltonian (21). This can be argued for 
exactly in the same way as in Footnote 8. Instead of (36) 
we now get 

$$ W = \int d\Gamma\, H_S(\Gamma, \lambda) \left[ \mathcal{P}(\Gamma, \tau_c) - \mathcal{P}_G(\Gamma) \right] \tag{39} $$

$$ + \int dQ\, dQ_i \left[ h(Q | Q_i, E_i) - E_i \right] \mathcal{P}(Q, \tau_c; Q_i, 0), \tag{40} $$

where $\mathcal{P}(\Gamma, \tau_c)$ is the phase-space probability distribution 
at $\tau = \tau_c$, while $\mathcal{P}(Q, \tau_c; Q_i, 0)$ is the two-time 
probability distribution of the coordinate. It is necessary to use 
the two-time distribution, since $h(Q | Q_i, E_i)$ explicitly 
depends on both initial and final values of the coordinate. 
Following the steps outlined after (37) we get 

$$ W = T \left( S[\mathcal{P}(\tau_c) \| \mathcal{P}_G] + S_{fg}[\mathcal{P}(\tau_c)] - S_{fg}[\mathcal{P}_G] \right) \tag{41} $$

$$ + \int dQ\, dQ_i \left[ h(Q | Q_i, E_i) - E_i \right] \mathcal{P}(Q, \tau_c; Q_i, 0), \tag{42} $$

where the temperature T comes from (34). The first 
term $T\, S[\mathcal{P}(\tau_c) \| \mathcal{P}_G]$ in the RHS of (41) is non-negative. 
The fine-grained entropy difference $S_{fg}[\mathcal{P}(\tau_c)] - S_{fg}[\mathcal{P}_G]$ 
does not have a definite sign, since the Liouville 
equation does not hold. Moreover, the RHS of (41), equal 
to $T \int d\Gamma\, [\mathcal{P}_G(\Gamma) - \mathcal{P}(\Gamma, \tau_c)] \ln \mathcal{P}_G(\Gamma)$, does not have a 
definite sign either. Even if the latter term is positive 
(e.g., because the fine-grained entropy increased in 
time: $S_{fg}[\mathcal{P}(\tau_c)] > S_{fg}[\mathcal{P}_G]$), the term in (42) does not 
have any reason to be positive. Apart from special 
coincidences, there is no reason why the two "dangerous" terms 
$S_{fg}[\mathcal{P}(\tau_c)] - S_{fg}[\mathcal{P}_G]$ and (42) would cancel each other. 

Thus the proof of Thomson's formulation can fail two 
times: once because the Liouville equation does not hold, 
and a second time because a cyclic change (35) of the 
parameter $\lambda$ does not yet imply a cyclic change of the 
Born-Oppenheimer term [this is the origin of the term in (42)]. 

The latter aspect can be studied separately. Let S be 
a single particle, and assume the following natural choice 
of the bare slow Hamiltonian: $H_S(\Pi, Q) = \frac{\Pi^2}{2M} + V(Q)$, 
where the potential V(Q) has its deepest minimum at 
$Q_0$: $V(Q) > V(Q_0)$ for $Q \neq Q_0$. In the initial Gibbs 
distribution of S take T = 0. Then the initial distribution 
is reduced to a single initial condition $\Pi_i = 0$ and $Q_i = Q_0$. 
The interaction of S with external sources of work 
is described by an additional potential $u(Q, \lambda_\tau)$, which is 
equal to zero both initially and at the end of the cycle; see 
(35). We assume that at intermediate times $u(Q, \lambda_\tau)$ is 
such that $Q_0$ ceases to be a local minimum of the overall 
potential, i.e., the particle located initially at $Q_0$ will 
move out of it and will change its energy. Now for the 
work one has analogously to (41, 42): 

$$ W = H_S(\Pi(\tau_c), Q(\tau_c)) - H_S(0, Q_0) \tag{43} $$

$$ + h(Q(\tau_c) | E_i, Q_0) - E_i, \tag{44} $$

where $\Pi(\tau_c)$ and $Q(\tau_c)$ are the values of the canonical 
coordinates at the end of the cyclic process. They are 
obtained from solving (20, 21). The term in (44) 
corresponds to that in (42). 

While $H_S(\Pi(\tau_c), Q(\tau_c)) - H_S(0, Q_0)$ is non-negative by 
construction, there is no general restriction on the sign 
of $h(Q(\tau_c) | E_i, Q_0) - E_i$. Noting the freedom in choosing 
$h(Q | E_i, Q_i)$, one can make $h(Q(\tau_c) | E_i, Q_0) - E_i$ so 
negative that the overall work is negative as well: W < 0. 



B. Entropic formulation of the second law. 



Coarse-grained entropy of S is defined as 

M 

Sc 9 \t\ =-£>(r fc ,r) In V(T k ,r), (45) 
fe=i 

where T^r^r) is the corresponding one-subsystem dis- 
tribution function. This is the sum of partial entropies for 
each subsystem. The difference 5 cff [r] — <5>/ s [t] between 
the coarse-grained entropy (|45p and fine-grained entropy 
Q32[) is non-negative (sub-additivity) and quantifies the 
relevance of correlations in S @, 0] . 

For additional motivation of the definition (45), we can
assume that the subsystems of S were interacting for a
finite time, and that $\tau$ is larger than this interaction time.

Note that the definition (45) is not the only possibility.
There are (infinitely) many ways of doing coarse-graining,
and thus many ways of defining non-equilibrium
entropy (see footnote 9). The main advantage of (45) is that it
allows one to see the entropy increase due to correlations
(which is the main qualitative image behind the entropic
formulation of the second law). To this end assume that
initially the subsystems of S are independent:

$$\mathcal{P}(\Gamma,0) = \prod_{k=1}^{M} \mathcal{P}(\Gamma_k,0). \tag{46}$$

This assumption specifies initial conditions needed for 
the existence of the thermodynamic arrow of time 0, H| • 
If S starts from such a non-equilibrium state, and if
the fine-grained entropy is constant in time due to the
Liouville theorem, then one employs sub-additivity to get
that the coarse-grained entropy does not decrease in time:

$$S_{cg}[\tau] \geq S_{fg}[\tau] = S_{fg}[0] = S_{cg}[0]. \tag{47}$$
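
The chain of relations in (47) can be checked on a discrete toy model. The sketch below is an illustration under stated assumptions, not the system studied in the text: two binary subsystems start in a product state (the analog of (46)), an invertible map on the four joint states plays the role of Liouville (volume-preserving) dynamics, and a many-to-one map serves as a crude analog of dynamics that contracts phase space. All names (`entropies`, `push_forward`, the two maps) are hypothetical.

```python
import math
from itertools import product


def entropy(probs):
    """Shannon entropy (natural logarithm) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropies(joint):
    """Return (S_fg, S_cg) for a joint distribution {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    s_fg = entropy(joint.values())
    s_cg = entropy(pa.values()) + entropy(pb.values())
    return s_fg, s_cg


def push_forward(joint, step):
    """Transport probabilities along the map `step` acting on joint states."""
    out = {}
    for state, p in joint.items():
        new = step(state)
        out[new] = out.get(new, 0.0) + p
    return out


# Independent initial state, the discrete analog of Eq. (46).
p_a = {0: 0.9, 1: 0.1}
p_b = {0: 0.7, 1: 0.3}
initial = {(a, b): p_a[a] * p_b[b] for a, b in product(p_a, p_b)}

# Invertible map (a permutation of the four joint states): Liouville analog.
reversible = push_forward(initial, lambda s: (s[0], s[0] ^ s[1]))
# Many-to-one map: assumed stand-in for phase-space contraction.
contracting = push_forward(initial, lambda s: (s[0], 0))

fg0, cg0 = entropies(initial)
fg1, cg1 = entropies(reversible)
fg2, cg2 = entropies(contracting)
```

Under the invertible map the fine-grained entropy is unchanged and the coarse-grained entropy grows, as in (47). Under the many-to-one map both entropies drop, which mirrors the statement that (47) need not hold once the Liouville theorem fails.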

However, once the Liouville theorem is not satisfied, $S_{fg}$
can decrease in time and then (47) does not hold in general.
There are other schemes for deriving the entropic
formulation of the second law for different sets of initial
states and for different definitions of the non-equilibrium
entropy 0, [U, 0, EH • All these derivations essentially use 
the Liouville theorem, so that none of them applies
to the present situation.

Note that the entropic formulation and the Thomson
formulation become inapplicable in different ways.
Eq. (47) shows that if the fine-grained entropy increases
in time, the entropic formulation is satisfied. In contrast,
an increasing fine-grained entropy does not yet ensure the
validity of the Thomson formulation, as we discussed
after (42).



The invalidity of the entropic formulation is studied
along similar lines. Assume that S consists of several
subsystems: $(\Pi; Q) = (\Pi_1,\ldots,\Pi_M; Q_1,\ldots,Q_M)$ (see Eq. (rrj)).



In particular, one can focus on certain macroscopic observables 
and define their physical, non-equilibrium entropy via maximiza- 
tion of information-theoretic entropy Q . 
VII. THE CAUSAL ARROW.

A. Reciprocity versus negligibility of the 
Born-Oppenheimer term. 

All the above anomalies with the second law are due
to the fact that the Born-Oppenheimer term $h(Q|Q_1,E_1)$
makes the dynamics of S not globally Hamiltonian. There
are two related options for recovering this feature. First,
one can try to see whether the dependence of $h(Q|Q_1,E_1)$
on $Q_1$ can be neglected, $h(Q|Q_1,E_1) \simeq h(Q|E_1)$, while
$h(Q|E_1)$ still exerts a sizable force on S. Second, one can
look for conditions where $h(Q|Q_1,E_1)$ can be neglected as
a whole. We shall now show that only the second option is
consistent.

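Employing (16, 17) as

$$\Omega(E_1,Q_1) = \Omega(h(Q|E_1,Q_1), Q), \tag{48}$$

and using (5) we get

$$\partial_{Q_1} h(Q|E_1,Q_1) = \frac{\partial_{Q_1}\Omega(E_1,Q_1)}{\omega(h(Q|Q_1,E_1), Q)}, \tag{49}$$

$$\partial_{Q} h(Q|E_1,Q_1) = -\frac{\partial_{Q}\Omega(E,Q)\big|_{E=h(Q|E_1,Q_1)}}{\omega(h(Q|Q_1,E_1), Q)}. \tag{50}$$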



These equations show that there is a certain reciprocity
(to be guessed already from (16, 17)) in the way
$h(Q|E_1,Q_1)$ depends on $Q$ and $Q_1$.

Let us demand that the Born-Oppenheimer
term $h(Q|E_1,Q_1)$ is independent of $Q_1$. Since
$\omega(h(Q|Q_1,E_1),Q)$ is finite, by (49) this demand amounts to
$\partial_{Q_1}\Omega(E_1,Q_1) \to 0$ for all $E_1$ and $Q_1$. Because this
must hold for all values of the arguments, it means requiring
$\partial_{Q}\Omega(E,Q)|_{E=h(Q|E_1,Q_1)} \to 0$. Due to (50), this implies
that $h(Q|Q_1,E_1)$ reduces to a constant, $h(Q|Q_1,E_1) = E_1$,
and, in addition, the energy of F does not change in
time. We are thus led to assuming that there is no
relevant interaction between S and F, a trivial option
which is definitely out of our interest.
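
The reciprocity expressed by (49) and (50) can be checked numerically on a simple model. The sketch below rests on assumptions that are not taken from the text: F is taken to be a one-dimensional harmonic oscillator whose frequency `nu(q)` depends on the slow coordinate (the specific form `1 + q*q` is hypothetical), so that the phase-space volume is $\Omega(E,Q) = 2\pi E/\nu(Q)$ and the conservation law (48) gives $h(Q|E_1,Q_1) = E_1\,\nu(Q)/\nu(Q_1)$. Derivatives are taken by central finite differences.

```python
import math


def nu(q):
    """Hypothetical fast frequency as a function of the slow coordinate."""
    return 1.0 + q * q


def Omega(e, q):
    """Phase-space volume of F below energy e (oscillator with frequency nu(q))."""
    return 2.0 * math.pi * e / nu(q)


def omega(e, q):
    """Energy derivative of Omega; for this model it does not depend on e."""
    return 2.0 * math.pi / nu(q)


def h(q, e1, q1):
    """Born-Oppenheimer term obtained by solving Omega(e1, q1) = Omega(h, q)."""
    return e1 * nu(q) / nu(q1)


def ddx(f, x, eps=1e-6):
    """Central finite-difference derivative of f at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def reciprocity_check(q, e1, q1):
    """Return (lhs, rhs) pairs for Eqs. (49) and (50) at the given point."""
    hv = h(q, e1, q1)
    lhs49 = ddx(lambda x: h(q, e1, x), q1)
    rhs49 = ddx(lambda x: Omega(e1, x), q1) / omega(hv, q)
    lhs50 = ddx(lambda x: h(x, e1, q1), q)
    rhs50 = -ddx(lambda x: Omega(hv, x), q) / omega(hv, q)
    return (lhs49, rhs49), (lhs50, rhs50)


eq49, eq50 = reciprocity_check(q=0.7, e1=2.0, q1=0.4)
```

In this model the two derivatives have opposite signs at the chosen point, and neither can be switched off without making `nu` constant, that is, without decoupling F from S.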

We are thus left with the second option: for the
time-scales relevant for the dynamics of S the
Born-Oppenheimer Hamiltonian $h(Q|Q_1,E_1)$ in (21) is negligible
compared to the bare slow Hamiltonian $H_S(\Pi,Q)$.
For this it is necessary to have:

$$H_S(\Pi,Q) \gg h(Q|Q_1,E_1). \tag{51}$$

In the absence of the Born-Oppenheimer term, the dynamics
driven by $H_S$ is globally Hamiltonian, the Liouville
theorem holds, and the second law is applicable to
S; see the previous sections.
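
The contrast between the two regimes can be made concrete with an exactly solvable toy model. The sketch below is an assumption-laden illustration, not the system analyzed in the text: it takes a free slow particle of unit mass ($V = 0$) and a hypothetical Born-Oppenheimer term $h(Q|Q_1,E_1) = E_1 Q/Q_1$, which is the form that follows from (48) when F is a harmonic oscillator with frequency proportional to $Q$. The force is then constant along each trajectory, but its value is set by the initial coordinate. The code computes the Jacobian determinant of the map from initial to final phase-space point; the Liouville theorem corresponds to a determinant equal to one.

```python
def flow(q1, p1, t, e1):
    """Closed-form flow for H_S = p^2/2 plus the toy term h = e1 * Q / q1.

    The force -dh/dQ = -e1/q1 is constant in time, but it depends on the
    initial coordinate q1; this is the non-globally Hamiltonian feature.
    """
    force = -e1 / q1
    q = q1 + p1 * t + 0.5 * force * t * t
    p = p1 + force * t
    return q, p


def jacobian_det(q1, p1, t, e1, eps=1e-6):
    """Determinant of d(Q(t), P(t)) / d(q1, p1) by central differences."""
    qa, pa = flow(q1 + eps, p1, t, e1)
    qb, pb = flow(q1 - eps, p1, t, e1)
    qc, pc = flow(q1, p1 + eps, t, e1)
    qd, pd = flow(q1, p1 - eps, t, e1)
    dq_dq1 = (qa - qb) / (2 * eps)
    dp_dq1 = (pa - pb) / (2 * eps)
    dq_dp1 = (qc - qd) / (2 * eps)
    dp_dp1 = (pc - pd) / (2 * eps)
    return dq_dq1 * dp_dp1 - dq_dp1 * dp_dq1


# Without the Born-Oppenheimer term the phase-space volume is preserved.
det_free = jacobian_det(q1=1.0, p1=0.3, t=1.0, e1=0.0)
# With the term switched on the volume element changes.
det_bo = jacobian_det(q1=1.0, p1=0.3, t=1.0, e1=1.0)
```

For this toy model the determinant can also be obtained by hand: it equals $1 - E_1 t^2/(2Q_1^2)$, so a nonzero $E_1$ contracts the phase-space volume element, and the contraction disappears as $E_1 \to 0$.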

Using (16, 17) and (8) one calculates:

$$\partial_{E_1} h(Q|E_1,Q_1) = \frac{\omega(E_1,Q_1)}{\omega(h(Q|Q_1,E_1), Q)} > 0. \tag{52}$$



This means that the Born-Oppenheimer term is a monotonically
increasing function of $E_1$, so that it decreases together
with $E_1$. Since the RHS of (52) is normally $\sim O(1)$, for
satisfaction of (51) we have to require

$$H_S(\Pi,Q) \gg E_1. \tag{53}$$

We already saw this condition at the end of section IV B
for a particular example. This example also shows that
there may be situations where, for sufficiently long times
of the slow motion, the Born-Oppenheimer force cannot be
neglected, even though it is numerically small; see (31) in
this context. In addition, there can be time limitations
related to the validity of the time-scale separation, and
thus to the definition of the Born-Oppenheimer force;
see the discussion after (30) in this context. Thus, at
the moment we cannot give a fairly general estimate for
the times on which the conditions (51) and (53) will be
sufficient for neglecting the Born-Oppenheimer term.



B. The causal arrow.

Eq. (51) also means that the interaction between S
and F acquires a causal arrow: S (cause) influences F
(effect), while F does not influence S.

Thus we see that for the present system, the thermodynamic
arrow and the causal arrow emerge simultaneously.
Recall in this context the operational definitions of the
causal arrow discussed in section II B.



VIII. MICROCANONICAL ENSEMBLE AND
SIMPLICITY PRINCIPLE.

After neglecting the Born-Oppenheimer term
$h(Q|Q_1,E_1)$ we recover a globally Hamiltonian
behavior for the dynamics of S. In particular, the
time-averages of the ergodic observables of S can be
described by the probability distribution

$$P_s(\Gamma) = \frac{\delta(U_s - H_S(\Gamma))}{\int d\Gamma\, \delta(U_s - H_S(\Gamma))}, \tag{54}$$

where $U_s$ is the slow energy. Since S does not get
back-reaction from F, the energy $U_s$ is a constant determined
by the initial conditions for the dynamics of S.

Recall that the very existence of (54) is related to
neglecting the back-action of F on S. For the same reason
the probability distribution (54) is unconditional. The
appearance of (54) can be argued following the lines
of section IV. In this context we should assume that S
with the Hamiltonian $H_S(\Pi,Q)$ is mixing and define the
mixing time $\tau_s$ of S.

The distributions (18) and (54) can be combined into
a non-equilibrium microcanonic ensemble for describing
the statistics of the overall system S + F on times
larger than $\tau_s$, but smaller than the mixing time $\tau_{s+f}$ of
the overall system:

$$P(\Gamma,z) = P_s(\Gamma)\, P_f(z|\Gamma). \tag{55}$$



It is understood that $Q_\tau$ needed in (18) for defining
$P_f(z|\Gamma)$ is obtained (for given initial $\Gamma = (Q,\Pi)$) by
solving the equations of motion (21) for S without the
Born-Oppenheimer term.
Note that $P(\Gamma,z)$ in (55) can be obtained via
sequential maximization: first of the conditional entropy
$-\int dz\, P(z|\Gamma) \ln P(z|\Gamma)$ of F for fixed slow variables,
and then of the unconditional entropy
$-\int d\Gamma\, P(\Gamma) \ln P(\Gamma)$ of S for fixed slow energy $U_s$. In
this context it is not difficult to accept the idea that the
microcanonic distribution is the simplest (least informative)
one for a fixed value of energy.

On the other hand, the probability distributions
$P(\Gamma|z)$ and $P(z)$, obtained from (55) via the Bayes
formula, are not simple. They are not microcanonic, and
in general they cannot even be obtained in a closed form.

Recalling that under condition (51) we identified S and
F as the cause and effect, respectively, we get that the
probability distributions P(S) and P(F|S) are simpler
than P(F) and P(S|F). As proposed in Ref. [18], in causal
reasoning one should tend to prefer the causal hypothesis
C → E (C is the cause, and E is its effect) if the
factorization of P(C, E) into P(C)P(E|C) leads to significantly
simpler terms P(C) and P(E|C) than the factorization
into P(E)P(C|E). Thus this simplicity argument for
causal reasoning is validated in the present approach.
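
The asymmetry between the two factorizations can be seen in a small discrete caricature. The sketch below is only an illustration under assumptions of our own: the cause C takes three values with a flat distribution (a stand-in for the microcanonic $P_s$), and for each value of C the effect E is flat on a support that depends on C (a stand-in for the conditional microcanonic distribution). Flatness on the support is used here as a crude proxy for simplicity; it is not the complexity measure of Ref. [18]. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def is_flat(dist):
    """True if all nonzero probabilities in `dist` are equal."""
    support = [p for p in dist if p != 0]
    return len(set(support)) == 1


n = 3
# P(C): uniform over the cause values (toy analog of a microcanonic P_s).
p_c = [Fraction(1, n)] * n
# P(E|C=c): uniform over the c-dependent support {0, ..., c}.
p_e_given_c = [
    [Fraction(1, c + 1) if e <= c else Fraction(0) for e in range(n)]
    for c in range(n)
]

# Joint distribution and the reverse factorization via the Bayes formula.
joint = [[p_c[c] * p_e_given_c[c][e] for e in range(n)] for c in range(n)]
p_e = [sum(joint[c][e] for c in range(n)) for e in range(n)]
p_c_given_e = [[joint[c][e] / p_e[e] for c in range(n)] for e in range(n)]
```

Here `p_c` and every row of `p_e_given_c` are flat on their supports, while `p_e` and `p_c_given_e[0]` are not: the factorization along the causal direction consists of the simpler pieces.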

The causal arrow persists in the global microcanonic
equilibrium which, if the overall system S + F is mixing
with a time $\tau_{s+f}$, is established for $\tau \gg \tau_{s+f}$:

$$P_{eq}(\Gamma,z) = \frac{\delta(\mathcal{E} - H_S(\Gamma) - H(Q,z))}{\int d\Gamma\, dz\, \delta(\mathcal{E} - H_S(\Gamma) - H(Q,z))}, \tag{56}$$



where $\mathcal{E}$ is the total energy. Eq. (56) is a stationary
distribution. The no-back-action condition (51) is now
replaced by its equilibrium analog

$$H_S(\Pi,Q) \gg H(Q,z). \tag{57}$$



However, once the slow Hamiltonian is much larger than
the fast Hamiltonian, we expect that the partial probability
$P_{eq}(\Pi,Q)$ will be close to $P_s(\Pi,Q)$ in (54). Indeed,
once $H(Q,z)$ is small, the overall energy $\mathcal{E}$ in (56)
should be nearly canceled by the bare slow Hamiltonian
$H_S(\Pi,Q)$, so that $P_{eq}(\Pi,Q)$ is proportional to a smeared
delta-function concentrated at $\mathcal{E} = H_S(\Pi,Q)$. For calculating
observables (at small $H(Q,z)$) this is the same as
$P_{eq}(\Pi,Q) \propto \delta(\mathcal{E} - H_S(\Pi,Q))$.

As for the conditional probability $P_{eq}(z|\Pi,Q) =
P_{eq}(z|\Gamma)$, it can always be written as

$$P_{eq}(z|\Gamma) = \frac{\delta(\mathcal{E} - H_S(\Gamma) - H(Q,z))}{\int dz\, \delta(\mathcal{E} - H_S(\Gamma) - H(Q,z))}. \tag{58}$$



Here $\mathcal{E} - H_S(\Gamma)$ is, of course, not the Born-Oppenheimer
energy $h(Q_\tau|E_1,Q)$ that shows up in the non-equilibrium
distribution (18). Still, $\mathcal{E} - H_S(\Gamma)$ can be seen as an
equilibrium analog of $h(Q_\tau|E_1,Q)$.



IX. SUMMARY. 

We studied a Hamiltonian system that consists of a
slow subsystem S and a fast subsystem F; see section II.



The separation into slow versus fast is one of the basic
ways of defining autonomous systems in natural sciences
[20]. In particular, the effective dynamics of slow
subsystems is studied in a great variety of different fields:
atomic and molecular physics, semi-classic physics
(including semi-classic gravity), physical chemistry,
synergetics, economics, etc.

Our main purpose was to relate two seemingly different
issues: i) the causal arrow, or unidirectional influence,
where S influences F, but does not get back-action;
ii) the thermodynamic arrow of time (second law) for the
system. Since the applicability of the second law to F is
well known || [ll[ , we focused on the second law as ap- 
plied to the autonomous, energy conserving, Hamiltonian 
dynamics of S. The presence of F is reflected in the
dynamics of S via an additional Born-Oppenheimer term
in the Hamiltonian of S. This term emerged during the
tracing out of F, and it depends on the initial coordinate
of S; see section IV. Thus, different initial coordinates of
S have different Hamiltonians: the dynamics of S is not
globally Hamiltonian. The cause of this is that, due to
the time-scale separation, the dynamics of F has
an adiabatic invariant (effective conservation law); see
section III.

The specific features of the Born-Oppenheimer term
make the basic formulations of the second law inapplicable
to the dynamics of S. These statements of the second
law are i) the Thomson formulation, which states that
no work can be extracted by means of a cyclic Hamiltonian
process (driven by an external source of work), if
the initial conditions of S are thermal, and ii) the entropic
formulation, which claims that the coarse-grained entropy
of S does not decrease, provided that S starts from a
low-entropy state. There are two mechanisms for this
inapplicability. First, the Liouville theorem (i.e.,
conservation of the fine-grained entropy) does not hold for
a non-globally Hamiltonian dynamics: the fine-grained
entropy can either increase or decrease in the course of
time. The second mechanism is relevant for the Thomson
formulation only and has to do with the behavior of
the Born-Oppenheimer term under a cyclic Hamiltonian
driving; see section VI for details.

As we argued in section VII A, the Born-Oppenheimer
term has a certain reciprocity feature. Its basic implication
for our purposes is that the only way to recover
a globally Hamiltonian dynamics for S is to neglect the
Born-Oppenheimer term as compared to the bare Hamiltonian
of S. By this we neglect the influence of F on S,
but, importantly, the influence of S on F is not neglected
and can be sizable. Once the Born-Oppenheimer term
can be neglected, the basic formulations of the second
law naturally apply to S. Thus we see that the emergence
of the thermodynamic arrow (second law) for S is
closely related to the causal arrow: S acts on F, but does
not get back-action.

Finally, in section VIII we studied our results in
the context of a causal inference principle proposed
recently in machine learning [18]. This principle plausibly
infers the cause-effect relation between two stochastic
variables, and it intends to cover especially those
situations where more standard causal inference
procedures do not apply. If we assume that S and
F are mixing systems, under the causal arrow condition
they are described by the microcanonic probability
distribution P(S) and the conditional microcanonic
distribution P(F|S). Now the factorization
of the joint probability P(cause = S, effect = F) into
P(cause) P(effect|cause) leads here to simpler expressions
than the factorization into P(effect) P(cause|effect).
This is the core of the inference principle proposed in
[18], and we conclude that this principle is validated in
the present approach.



Acknowledgments 

The work was supported by Volkswagenstiftung grant 
"Quantum Thermodynamics: Energy and information 
flow at nanoscale". 



[1] H. Reichenbach, The Direction of Time (California Uni- 
versity Press, Berkeley, 1956). 

[2] O. Penrose and I.C. Percival, Proc. Phys. Soc., 79, 605 
(1962). 

[3] L. S. Schulman, Time's Arrows and Quantum Measurement
(Cambridge University Press, Cambridge, 1997).

[4] R. Balian, From Microphysics to Macrophysics, volumes 
I and II (Springer, 1992). 

[5] E.T. Jaynes, Am. J. Phys. 33, 391 (1965). 

[6] H.D. Zeh, The Physical Basis of the Direction of Time 
(Springer, Berlin 2001). 

[7] G. Lindblad, Non-Equilibrium Entropy and Irreversibility
(D. Reidel, Dordrecht, 1983).

[8] I.M. Bassett, Phys. Rev. A 18, 2356 (1978). A. Lenard, J. 
Stat. Phys., 19, 575 (1978). W. Pusz and L. Woronowicz, 
Comm. Math. Phys. 58, 273 (1978). 

[9] D. Janzing, P. Wocjan, R. Zeier, R. Geiss and Th. Beth,
Int. Jour. Theor. Phys. 39, 2217 (2000).

[10] A.E. Allahverdyan and Th.M. Nieuwenhuizen, Phys.
Rev. B, 66, 115309 (2002).

[11] A.E. Allahverdyan and Th.M. Nieuwenhuizen, Phys.
Rev. E, 75, 051124 (2007); Phys. Rev. E, 71, 046107
(2005).

[12] The entropic analog of the minimum work principle was 
studied in: 

M.Campisi, cond-mat stat-mech/0704256 1; M. Campisi. 
Stud. Hist. Phil. M. P. 36, 275 (2005). 

[13] W. De Roeck, C. Maes and K. Netocny,
cond-mat/0508089.

[14] J. Gemmer, M. Michel and G. Mahler, Quantum Ther- 
modynamics - Emergence of Thermodynamic Behavior 
within Composite Quantum Systems, vol. 657 of Lecture 
Notes in Physics (Springer, Berlin, 2004). 

[15] R. Penrose, J. Stat. Phys. 77, 217 (1994).
R.M. Wald, gr-qc/0507094.
S.M. Carroll and J. Chen, hep-th/0410270 and
gr-qc/0505037.

[16] H. Hoffding, History of Modern Philosophy (Dover, NY, 
1955). 

[17] J. Pearl, Causality (Oxford University Press, Oxford,
2000).
P. Spirtes, G. Glymour and R. Scheines, Causation,
Prediction, and Search (Springer-Verlag, New York, 1993).
[18] X. Sun, D. Janzing, and B. Scholkopf, Causal inference 
by choosing graphs with most plausible Markov kernels, 
in Proceeding of the 9th Int. Symp. Art. Int. and Math., 
Florida, 2006. 

X. Sun, D. Janzing and B. Scholkopf, to appear in Pro- 
ceedings of the European Symposium on Artificial Neural 
Networks (ESANN), 2007. 

X. Sun, D. Janzing, to appear in Proceedings of the 
European Symposium on Artificial Neural Networks 
(ESANN), 2007. 

[19] Y. Kano and S. Shimizu, Causal inference using non- 
normality, in ISM report on Research and Education, 
No 17, pp. 261-270, The Institute of Statistical Mathe- 
matics, Tokyo, Japan, 2003. 

[20] H. Haken, Synergetics, an Introduction: Nonequilibrium
Phase Transitions and Self-Organization in Physics,
Chemistry, and Biology (Springer-Verlag, 1983).
A. Gorban and I. Karlin, Invariant Manifolds for Phys- 
ical and Chemical Kinetics (Lecture Notes in Physics, 
Springer, 2005). 

[21] N. G. van Kampen, Physica 53, 98 (1971). 

[22] P. Hertz, Ann. Phys. (Leipzig) 33, 225 (1910); ibid. 33, 
537. T. Kasuga, Proc. Jpn. Acad. 37, 366 (1961). 

[23] A. Munster, Statistical Thermodynamics (Springer- 
Verlag, Berlin, 1969), Vol. 1. 

R. Becker, Theory of Heat (Springer, New York, 1967). 
V. L. Berdichevsky, Thermodynamics of Chaos and Or- 
der, (Addison Wesley Longman, Essex, England, 1997). 

[24] H.H. Rugh, Phys. Rev. E 64, 055101 (2001). 

[25] E. Ott, Phys. Rev. Lett. 42, 1628 (1979). C. Jarzynski, 
Phys. Rev. Lett. 71, 839 (1993). 

[26] R.Z. Sagdeev, D.A. Usikov, and G.M. Zaslavsky, Nonlinear
Physics (Harwood, Philadelphia, 1988).

[27] A.J. Lichtenberg and M.A. Lieberman, Regular and
Chaotic Dynamics (Springer-Verlag, New York, 1991).



